Results 1 - 20 of 85
1.
Comput Biol Med ; 174: 108438, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38613893

ABSTRACT

BACKGROUND: Angiogenesis plays a vital role in the pathogenesis of several human diseases, particularly in the case of solid tumors. In the realm of cancer treatment, recent investigations into peptides with anti-angiogenic properties have yielded encouraging outcomes, thereby creating a hopeful therapeutic avenue for the treatment of cancer. Therefore, correctly identifying the anti-angiogenic peptides is extremely important in comprehending their biophysical and biochemical traits, laying the groundwork for uncovering novel drugs to combat cancer. METHODS: In this work, we present a novel ensemble-learning-based model, Stack-AAgP, specifically designed for the accurate identification and interpretation of anti-angiogenic peptides (AAPs). Initially, a feature representation approach is employed, generating 24 baseline models through six machine learning algorithms (random forest [RF], extra tree classifier [ETC], extreme gradient boosting [XGB], light gradient boosting machine [LGBM], CatBoost, and SVM) and four feature encodings (pseudo-amino acid composition [PAAC], amphiphilic pseudo-amino acid composition [APAAC], composition of k-spaced amino acid pairs [CKSAAP], and quasi-sequence-order [QSOrder]). Subsequently, the output (predicted probabilities) from 24 baseline models was inputted into the same six machine-learning classifiers to generate their respective meta-classifiers. Finally, the meta-classifiers were stacked together using the ensemble-learning framework to construct the final predictive model. RESULTS: Findings from the independent test demonstrate that Stack-AAgP outperforms the state-of-the-art methods by a considerable margin. Systematic experiments were conducted to assess the influence of hyperparameters on the proposed model. 
Our model, Stack-AAgP, was evaluated on the independent NT15 dataset, revealing superiority over existing predictors with an accuracy improvement ranging from 5% to 7.5% and an increase in Matthews Correlation Coefficient (MCC) from 7.2% to 12.2%.
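
The two-level layout described above (six algorithms crossed with four encodings, whose predicted probabilities feed the meta-classifiers) can be sketched as follows. This is a minimal illustration of the bookkeeping only, not the authors' implementation: the function names and the placeholder probabilities are hypothetical, and no real classifier is trained.

```python
from itertools import product

# Algorithms and feature encodings as named in the abstract.
ALGORITHMS = ["RF", "ETC", "XGB", "LGBM", "CatBoost", "SVM"]
ENCODINGS = ["PAAC", "APAAC", "CKSAAP", "QSOrder"]


def baseline_grid(algorithms, encodings):
    """Return one (algorithm, encoding) pair per baseline model."""
    return list(product(algorithms, encodings))


def meta_features(probabilities_by_model, grid):
    """Stack per-model probabilities into one row per peptide.

    probabilities_by_model maps (algorithm, encoding) to a list with one
    predicted probability per peptide. Column order follows `grid`.
    """
    n_peptides = len(probabilities_by_model[grid[0]])
    return [
        [probabilities_by_model[key][i] for key in grid]
        for i in range(n_peptides)
    ]


grid = baseline_grid(ALGORITHMS, ENCODINGS)

# Hypothetical placeholder probabilities for two peptides (not model output).
placeholder = {key: [0.5, 0.5] for key in grid}
rows = meta_features(placeholder, grid)
```

Each row in `rows` has 24 columns, one per baseline model, which is the input shape a meta-classifier would see under this reading of the abstract.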

2.
Int J Mol Sci ; 25(7)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38612558

ABSTRACT

Cruzipain inhibitors are sought after as medications to treat Chagas disease because of the need for safer, more effective treatments. Trypanosoma cruzi is the source of cruzipain, a crucial cysteine protease that has driven interest in using computational methods to create more effective inhibitors. We employed a 3D-QSAR model, using a dataset of 36 known inhibitors, and a pharmacophore model to identify potential inhibitors for cruzipain. We also built a deep learning model using the DeepPurpose library, trained on 204 active compounds, and validated it with a specific test set. During a comprehensive screening of the DrugBank database of 8533 molecules, the pharmacophore and deep learning models identified 1012 and 340 drug-like molecules, respectively. These molecules were further evaluated through molecular docking, followed by induced-fit docking. Ultimately, molecular dynamics simulation was performed for the final potent inhibitors that exhibited strong binding interactions. These results present four novel cruzipain inhibitors that can inhibit the cruzipain protein of T. cruzi.


Subjects
Chagas Disease , Cysteine Endopeptidases , Humans , Molecular Docking Simulation , Protozoan Proteins , Chagas Disease/drug therapy , Drug Design
3.
Arch Toxicol ; 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38619593

ABSTRACT

Cytochrome P450 enzymes are a superfamily of enzymes responsible for the metabolism of a variety of medicines and xenobiotics. Among the Cytochrome P450 family, five isozymes (1A2, 2C9, 2C19, 2D6, and 3A4) are most important for the metabolism of xenobiotics. Inhibition of any of these five CYP isozymes causes drug-drug interactions with high pharmacological and toxicological effects. Predicting whether these isozymes are inhibited is therefore of great importance. Many techniques based on machine learning and deep learning algorithms are currently being used to predict whether these isozymes will be inhibited or not. In this study, three different molecular or substructural representations of the molecules, namely Morgan, MACCS and Morgan (combined), and RDKit, are used to train a distinct SVM model for each isozyme (1A2, 2C9, 2C19, 2D6, and 3A4). On the independent dataset, Morgan fingerprints provided the best results, while MACCS and Morgan (combined) achieved comparable results in terms of balanced accuracy (BA), sensitivity (Sn), and Matthews correlation coefficient (MCC). For the Morgan fingerprints, balanced accuracies (BA), Matthews correlation coefficients (MCC), and sensitivities (Sn) for each CYP isozyme (1A2, 2C9, 2C19, 2D6, and 3A4) on the independent dataset ranged between 0.81 and 0.85, 0.61 and 0.70, and 0.72 and 0.83, respectively. Similarly, on the independent dataset, MACCS and Morgan (combined) fingerprints achieved competitive results, with balanced accuracies (BA), Matthews correlation coefficients (MCC), and sensitivities (Sn) for each CYP isozyme ranging between 0.79 and 0.85, 0.59 and 0.69, and 0.69 and 0.82, respectively.
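
The three metrics reported above (BA, Sn, MCC) all follow from the binary confusion matrix. The sketch below uses the standard textbook definitions; the labels are made-up toy values, and the convention of returning 0.0 when a denominator is zero is an assumption of this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = inhibitor, 0 = non-inhibitor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Sn = TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Sp = TN / (TN + FP)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def balanced_accuracy(tp, tn, fp, fn):
    """BA = (Sn + Sp) / 2."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
ba = balanced_accuracy(tp, tn, fp, fn)
m = mcc(tp, tn, fp, fn)
```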

4.
Biosystems ; 237: 105177, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38458346

ABSTRACT

The escalating global incidence of cancer poses significant health challenges, underscoring the need for innovative and more efficacious treatments. Cancer immunotherapy, a promising approach leveraging the body's immune system against cancer, emerges as a compelling solution. Consequently, the identification and characterization of tumor T-cell antigens (TTCAs) have become pivotal for exploration. In this manuscript, we introduce TTCA-IF, an integrative machine learning-based framework designed for TTCAs identification. TTCA-IF employs ten feature encoding types in conjunction with five conventional machine learning classifiers. To establish a robust foundation, these classifiers are trained, resulting in the creation of 150 baseline models. The outputs from these baseline models are then fed back into the five classifiers, generating their respective meta-models. Through an ensemble approach, the five meta-models are seamlessly integrated to yield the final predictive model, the TTCA-IF model. Our proposed model, TTCA-IF, surpasses both baseline models and existing predictors in performance. In a comparative analysis involving nine novel peptide sequences, TTCA-IF demonstrated exceptional accuracy by correctly identifying 8 out of 9 peptides as TTCAs. As a tool for screening and pinpointing potential TTCAs, we anticipate TTCA-IF to be invaluable in advancing cancer immunotherapy.


Subjects
Machine Learning , Neoplasms , Humans , Thiazolidines , T-Lymphocytes , Neoplasms/therapy , Neoplasms/diagnosis
5.
iScience ; 27(3): 109200, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38420582

ABSTRACT

Remarkable and intelligent perovskite solar cells (PSCs) have attracted substantial attention from researchers and are undergoing rapid advancements in photovoltaic technology. These developments aim to create highly efficient energy devices with fewer dominant recombination losses within the realm of third-generation solar cells. Diverse machine learning (ML) algorithms were implemented to address dominant losses due to recombination in PSCs, focusing on grain boundaries (GBs), interfaces, and band-to-band recombination. The extreme gradient boosting (XGBoost) classifier effectively predicts the recombination losses. Our model was trained with 7-fold cross-validation to ensure generalizability and robustness. Using Optuna for hyperparameter optimization and Shapley additive explanations (SHAP) to investigate the influence of features on target variables, the model achieved 85% accuracy on over 2 million simulated data points. Because of the input parameters (light intensity and open-circuit voltage), the performance evaluation measures for the dominant losses caused by recombination predicted by the proposed model were superior to those of state-of-the-art models.
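
As a rough illustration of the 7-fold cross-validation mentioned above, the following sketch partitions sample indices into seven nearly equal folds. It is a generic split, not the authors' pipeline: the function name is hypothetical, and the folds are contiguous and unshuffled, whereas practical workflows usually shuffle or stratify first.

```python
def k_fold_indices(n_samples, k=7):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Twenty samples stand in for the simulated data set.
folds = list(k_fold_indices(20, k=7))
```

Every sample appears in exactly one validation fold, so each one is used for validation once and for training six times.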

6.
Comput Biol Med ; 169: 107925, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38183701

ABSTRACT

Serine phosphorylation plays a pivotal role in the pathogenesis of various cellular processes and diseases. Roughly 81% of human diseases have links to phosphorylation, and an overwhelming 86.4% of protein phosphorylation takes place at serine residues. In eukaryotes, over a quarter of proteins undergo phosphorylation, with more than half implicated in numerous disorders, notably cancer and reproductive system diseases. This study primarily focuses on serine-phosphorylation-driven pathogenesis and the critical role of conserved motif identification. While numerous techniques exist for predicting serine phosphorylation sites, traditional wet lab experiments are resource-intensive. Our paper introduces a cutting-edge deep learning tool for predicting S phosphorylation sites, integrating explainable AI for motif identification, a transformer language model, and deep neural network components. We trained our model on protein sequences from UniProt, validated it against the dbPTM benchmark dataset, and employed the PTMD dataset to explore motifs related to mammalian disorders. Our results highlight that our model surpasses other deep learning predictors by a significant 3%. Furthermore, we utilized the local interpretable model-agnostic explanations (LIME) approach to shed light on the predictions, emphasizing the amino acid residues crucial for S phosphorylation. Notably, our model also outperformed competitors in kinase-specific serine phosphorylation prediction on benchmark datasets.
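
Site-level predictors of this kind typically score a fixed-length sequence window centred on each candidate serine. The sketch below shows one way to extract such windows. The flank length, the padding character "X", and the function name are assumptions made for illustration; the abstract does not state the window settings the authors used.

```python
def serine_windows(sequence, flank=7, pad="X"):
    """Return (position, window) for every serine; positions are 1-based.

    Each window has length 2 * flank + 1 with the serine at its centre.
    Sequence ends are padded so that terminal serines still get a full window.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "S":
            # Index i in seq corresponds to index i + flank in padded.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


# Toy sequence for illustration only.
sites = serine_windows("MSTAS", flank=2)
```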


Subjects
Neural Networks, Computer , Proteins , Animals , Humans , Phosphorylation , Proteins/metabolism , Amino Acid Sequence , Serine/metabolism , Mammals/metabolism
7.
Comput Biol Med ; 170: 108007, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38242015

ABSTRACT

Drug combinations are frequently used to treat cancer to reduce side effects and increase efficacy. The experimental discovery of drug combination synergy is time-consuming and expensive for large datasets. Therefore, an efficient and reliable computational approach is required to investigate these drug combinations. Advancements in deep learning can handle large datasets with various biological problems. In this study, we developed a SynergyGTN model based on the Graph Transformer Network to predict the synergistic drug combinations against an untreated cancer cell line expression profile. We represent each drug as a graph, with each node and edge of the graph containing nine types of atomic feature vectors and four bond features, respectively. The cell lines are represented by their gene expression profiles. The drug graphs were passed through the GTN layers to extract a generalized feature map for each drug pair. The extracted drug-pair features and cell-line gene expression profiles were concatenated and subsequently processed through multiple densely connected layers. SynergyGTN outperformed the state-of-the-art methods, with a receiver operating characteristic area under the curve improvement of 5% on the 5-fold cross-validation. The accuracy of SynergyGTN was further verified through three types of cross-validation strategies, namely leave-drug-out, leave-combination-out, and leave-tissue-out, resulting in improvements in accuracy of 8%, 1%, and 2%, respectively. The AstraZeneca DREAM dataset was utilized as an independent dataset to validate and assess the generalizability of the proposed method, resulting in an improvement in balanced accuracy of 13%. In conclusion, SynergyGTN is a reliable and efficient computational approach for predicting drug combination synergy in cancer treatment.
Finally, we developed a web server tool for the pharmaceutical industry and researchers, available at: http://nsclbio.jbnu.ac.kr/tools/SynergyGTN/.
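
A leave-drug-out test holds out every sample that involves a chosen drug, so the model is evaluated on a compound it never saw in training. The sketch below shows one common reading of that protocol. The drug and cell-line identifiers are hypothetical, and the exact splitting rules used for SynergyGTN are not given in the abstract.

```python
def leave_drug_out(samples, held_out_drug):
    """Split (drug_a, drug_b, cell_line) triples on a held-out drug.

    Samples containing the held-out drug in either position go to the test
    set; all remaining samples form the training set.
    """
    train = [s for s in samples if held_out_drug not in (s[0], s[1])]
    test = [s for s in samples if held_out_drug in (s[0], s[1])]
    return train, test


# Hypothetical identifiers for illustration only.
samples = [
    ("D1", "D2", "C1"),
    ("D1", "D3", "C1"),
    ("D2", "D3", "C2"),
    ("D3", "D4", "C2"),
]
train, test = leave_drug_out(samples, "D1")
```

Leave-combination-out and leave-tissue-out splits follow the same pattern, filtering on the drug pair or on a tissue label instead of a single drug.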


Subjects
Computational Biology , Transcriptome , Drug Synergism , Computational Biology/methods , Drug Combinations , Cell Line, Tumor
8.
Int J Mol Sci ; 25(2)2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38255790

ABSTRACT

Computational methods play a pivotal role in the pursuit of efficient drug discovery, enabling the rapid assessment of compound properties before costly and time-consuming laboratory experiments. With the advent of technology and large data availability, machine and deep learning methods have proven efficient in predicting molecular solubility. High-precision in silico solubility prediction has revolutionized drug development by enhancing formulation design, guiding lead optimization, and predicting pharmacokinetic parameters. These benefits yield considerable cost and time savings, resulting in a more efficient and shortened drug development process. The proposed SolPredictor is a computational model for solubility prediction. The model is based on residual graph neural network convolution (RGNN). The RGNNs were designed to capture long-range dependencies in graph-structured data. Residual connections enable information to be utilized over various layers, allowing the model to capture and preserve essential features and patterns scattered throughout the network. The two largest datasets available to date are compiled, and the model uses a simplified molecular-input line-entry system (SMILES) representation. In ten-fold cross-validation, SolPredictor achieved a Pearson correlation coefficient R² of 0.79±0.02 and a root mean square error (RMSE) of 1.03±0.04. The proposed model was evaluated using five independent datasets. Error analysis, hyperparameter optimization analysis, and model explainability were used to determine the molecular features that were most valuable for prediction.
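
The two regression metrics quoted above can be computed from paired true and predicted values as sketched below. These are the standard definitions; the solubility values are invented for illustration, and treating the reported R2 as the square of Pearson's r is an assumption based on the abstract's wording.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between paired values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented values for illustration only (not from any dataset).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.5, 4.5]
error = rmse(y_true, y_pred)
r_squared = pearson_r(y_true, y_pred) ** 2
```

The toy predictions are offset by a constant, which shows why both numbers are reported: correlation stays perfect under a systematic bias, while RMSE exposes it.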


Subjects
Drug Development , Drug Discovery , Solubility , Correlation of Data , Neural Networks, Computer
9.
Mol Inform ; 43(2): e202300217, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38050743

ABSTRACT

Rapid and accurate prediction of bandgaps and efficiency of perovskite solar cells is a crucial challenge for various solar cell applications. Existing theoretical and experimental methods often measure these parameters accurately; however, they are costly and time-consuming. Machine learning-based approaches offer a promising and computationally efficient method to address this problem. In this study, we trained different machine learning (ML) models using previously reported experimental data. Among the different ML models, the CatBoostRegressor performed better for both bandgap and efficiency approximations. We evaluated the proposed model using k-fold cross-validation and investigated the relative importance of input features using Shapley Additive Explanations (SHAP). SHAP provides valuable insights into how features contribute to the predictions of the proposed model. Furthermore, we validated the performance of the proposed model using an independent dataset, demonstrating its robustness and generalizability beyond the training data. Our findings show that machine learning-based approaches, with the aid of SHAP, can provide a promising and computationally efficient method for the accurate and rapid prediction of perovskite solar cell properties. The proposed model is expected to facilitate the discovery of new perovskite materials and is freely available at GitHub (https://github.com/AsadKhanJBNU/perovskite_bandgap_and_efficiency.git) for the perovskite community.


Subjects
Calcium Compounds , Oxides , Titanium , Machine Learning
10.
Comput Biol Med ; 168: 107724, 2024 01.
Article in English | MEDLINE | ID: mdl-37989075

ABSTRACT

BACKGROUND: The most commonly used therapy currently for inflammatory and autoimmune diseases is nonspecific anti-inflammatory drugs, which have various hazardous side effects. Recently, some anti-inflammatory peptides (AIPs) have been found to be a substitute therapy for inflammatory diseases like rheumatoid arthritis and Alzheimer's. Therefore, the identification of these AIPs is an emerging topic that is equally important. METHODS: In this work, we have proposed an identification model for AIPs using a voting classifier. We used eight different feature descriptors and five conventional machine-learning classifiers. The eight feature encodings were concatenated to get a hybrid feature set. The five baseline models trained on the hybrid feature set were integrated via a voting classifier. Finally, a feature selection algorithm was used to select the optimal feature set for the construction of our final model, named IF-AIP. RESULTS: We tested the proposed model on two independent datasets. On independent dataset 1, the IF-AIP model shows an improvement of 3%-5.6% in terms of accuracy and 6.7%-10.8% in terms of MCC compared to the existing methods. On independent dataset 2, IF-AIP shows an overall improvement of 2.9%-5.7% in terms of accuracy and 8.3%-8.6% in terms of MCC compared to the existing methods. A comparative performance analysis was conducted between the proposed model and existing methods using a set of 24 novel peptide sequences. Notably, the IF-AIP method exhibited exceptional accuracy, correctly identifying all 24 peptides as AIPs. The source code, pre-trained models, and all datasets are made available at https://github.com/Mir-Saima/IF-AIP.
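
A voting classifier combines the outputs of several baseline models into one decision. The sketch below shows soft voting, which averages predicted probabilities. Whether IF-AIP uses soft or hard voting is not stated in the abstract, so this is one plausible variant; the function name, threshold, and probabilities are hypothetical.

```python
def soft_vote(probabilities, threshold=0.5):
    """Average per-model probabilities and threshold the mean.

    probabilities is a list with one inner list per model, each holding one
    predicted probability per peptide. Returns (mean_probabilities, labels).
    """
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    mean = [
        sum(model[i] for model in probabilities) / n_models
        for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in mean]
    return mean, labels


# Hypothetical outputs of five baseline models for two peptides.
model_outputs = [
    [0.9, 0.2],
    [0.8, 0.4],
    [0.7, 0.1],
    [0.6, 0.3],
    [0.5, 0.5],
]
mean_probs, labels = soft_vote(model_outputs)
```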


Subjects
Machine Learning , Peptides , Algorithms , Anti-Inflammatory Agents/analysis , Software
11.
Bioinformatics ; 39(11)2023 11 01.
Article in English | MEDLINE | ID: mdl-37929975

ABSTRACT

MOTIVATION: The origins of replication sites (ORIs) are precise regions inside the DNA sequence where the replication process begins. These locations are critical for preserving the genome's integrity during cell division and guaranteeing the faithful transfer of genetic data from generation to generation. The advent of experimental techniques has aided in the discovery of ORIs in many species. Experimentation, on the other hand, is often more time-consuming and pricey than computational approaches, and it necessitates specific equipment and knowledge. Recently, ORI sites have been predicted using computational techniques like motif-based searches and artificial intelligence algorithms based on sequence characteristics and chromatin states. RESULTS: In this article, we developed ORI-Explorer, a unique artificial intelligence-based technique that combines multiple feature engineering techniques to train CatBoost Classifier for recognizing ORIs from four distinct eukaryotic species. ORI-Explorer was created by utilizing a unique combination of three traditional feature-encoding techniques and a feature set obtained from a deep-learning neural network model. The ORI-Explorer has significantly outperformed current predictors on the testing dataset. Furthermore, by employing the sophisticated SHapley Additive exPlanation method, we give crucial insights that aid in comprehending model success, highlighting the most relevant features vital for forecasting cell-specific ORIs. ORI-Explorer is also intended to aid community-wide attempts in discovering potential ORIs and developing innovative verifiable biological hypotheses. AVAILABILITY AND IMPLEMENTATION: The used datasets along with the source code are made available through https://github.com/Z-Abbas/ORI-Explorer and https://zenodo.org/record/8358679.


Subjects
Artificial Intelligence , Replication Origin , DNA Replication , Chromatin , Base Sequence
12.
J Mol Biol ; 435(23): 168314, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37852600

ABSTRACT

Enhancers are DNA regions that are responsible for controlling the expression of genes. Enhancers are usually found upstream or downstream of a gene, or even inside a gene's intron region, but are normally located at a distant location from the genes they control. By integrating experimental and computational approaches, it is possible to uncover enhancers within DNA sequences, which possess regulatory properties. Experimental techniques such as ChIP-seq and ATAC-seq can identify genomic regions that are associated with transcription factors or accessible to regulatory proteins. On the other hand, computational techniques can predict enhancers based on sequence features and epigenetic modifications. In our study, we have developed a multi-classifier stacked ensemble (MCSE-enhancer) model that can accurately identify enhancers. We utilized feature descriptors from various physicochemical properties as input for our six baseline classifiers and built a stacked classifier, which outperformed previous enhancer classification techniques in terms of accuracy, specificity, sensitivity, and Matthews correlation coefficient. Our model achieved an accuracy of 81.5%, representing a 2-3% improvement over existing models.


Subjects
Computational Biology , Enhancer Elements, Genetic , Machine Learning , Sequence Analysis, DNA , Computational Biology/methods , DNA/chemistry , DNA/genetics , Transcription Factors/chemistry , Sequence Analysis, DNA/methods
13.
J Chem Inf Model ; 63(20): 6198-6211, 2023 10 23.
Article in English | MEDLINE | ID: mdl-37819031

ABSTRACT

Absorption is an important area of research in pharmacochemistry and drug development, because the drug has to be absorbed before any drug effects can occur. Furthermore, the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) profile of drugs can be directly and considerably altered by modulating factors affecting absorption. Many drugs in development fail because of poor absorption. The research and continuous efforts of researchers in recent years have brought many successes and promises in drug absorption property prediction, especially in silico, which helps to reduce the time and cost significantly for screening undesirable drug candidates. In this report, we explicitly provide an overview of recent in silico studies on predicting absorption properties, especially from 2019 to the present, using artificial intelligence. Additionally, we have collected and investigated public databases that support absorption prediction research. On those grounds, we also proposed the challenges and development directions of absorption prediction in the future. We hope this review can provide researchers with valuable guidelines on absorption prediction to facilitate the development of newer approaches in drug discovery.


Subjects
Artificial Intelligence , Drug Discovery , Chemical Phenomena , Databases, Factual
14.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37555812

ABSTRACT

MOTIVATION: The investigation of DNA methylation can shed light on the processes underlying human well-being and help determine overall human health. However, insufficient coverage makes it challenging to implement single-stranded DNA methylation sequencing technologies, highlighting the need for an efficient prediction model. Models are required to create an understanding of the underlying biological systems and to project single-cell (methylated) data accurately. RESULTS: In this study, we developed positional features for predicting CpG sites. Positional characteristics of the sequence are derived using data from CpG regions and the separation between nearby CpG sites. Multiple optimized classifiers and different ensemble learning approaches are evaluated. The OPTUNA framework is used to optimize the algorithms. The CatBoost algorithm followed by the stacking algorithm outperformed existing DNA methylation identifiers. AVAILABILITY AND IMPLEMENTATION: The data and methodologies used in this study are openly accessible to the research community. Researchers can access the positional features and algorithms used for predicting CpG site methylation patterns. To achieve superior performance, we employed the CatBoost algorithm followed by the stacking algorithm, which outperformed existing DNA methylation identifiers. The proposed iCpG-Pos approach utilizes only positional features, resulting in a substantial reduction in computational complexity compared to other known approaches for detecting CpG site methylation patterns. In conclusion, our study introduces a novel approach, iCpG-Pos, for predicting CpG site methylation patterns. By focusing on positional features, our model offers both accuracy and efficiency, making it a promising tool for advancing DNA methylation research and its applications in human health and well-being.
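
The positional features described above rest on the locations of CpG sites and the spacing between neighbouring sites. A minimal sketch of that idea follows. The feature definition here (distance to the previous and next CpG, with None at the ends) is an assumption for illustration and is not taken from the iCpG-Pos implementation.

```python
def cpg_positions(sequence):
    """Return 0-based start indices of every CG dinucleotide."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i : i + 2] == "CG"]


def neighbor_distances(positions):
    """For each site, return (distance to previous site, distance to next site).

    None marks a missing neighbour at either end of the sequence.
    """
    features = []
    last = len(positions) - 1
    for k, pos in enumerate(positions):
        prev_d = pos - positions[k - 1] if k > 0 else None
        next_d = positions[k + 1] - pos if k < last else None
        features.append((prev_d, next_d))
    return features


# Toy sequence for illustration only.
sites = cpg_positions("ACGTTCGCGA")
features = neighbor_distances(sites)
```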


Subjects
Computational Biology , Computational Biology/methods , Single-Cell Analysis , Whole Genome Sequencing , DNA Methylation
15.
Methods ; 217: 49-56, 2023 09.
Article in English | MEDLINE | ID: mdl-37454743

ABSTRACT

The cytokine interleukin-4 (IL-4) plays an important role in our immune system. IL-4 leads the way in the differentiation of naïve T-helper 0 cells (Th0) to T-helper 2 cells (Th2). The Th2 responses are characterized by the release of IL-4. CD4+ T cells produce the cytokine IL-4 in response to exogenous parasites. IL-4 has a critical role in the growth of CD8+ cells, inflammation, and responses of T-cells. We propose an ensemble model for the prediction of IL-4 inducing peptides. Four feature encodings were extracted to build an efficient predictor: pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, quasi-sequence-order, and Shannon entropy. We developed an ensemble learning model fusion of random forest, extreme gradient boost, light gradient boosting machine, and extra tree classifier in the first layer, and a Gaussian process classifier as a meta classifier in the second layer. The outcome of the benchmarking testing dataset, with a Matthews correlation coefficient of 0.793, showed that the meta-model (Meta-IL4) outperformed individual classifiers. The highest accuracy achieved by the Meta-IL4 model is 90.70%. These findings suggest that peptides that induce IL-4 can be predicted with reasonable accuracy. These models could aid in the development of peptides that trigger the appropriate Th2 response.
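
Of the four encodings listed above, Shannon entropy is the simplest to show in full. The sketch below computes it over the residue frequencies of a whole peptide, in bits. Computing it at the whole-sequence level is an assumption of this example, and the peptides are toy strings rather than real IL-4 inducers.

```python
import math
from collections import Counter


def shannon_entropy(peptide):
    """Shannon entropy (in bits) of the residue distribution in a peptide."""
    counts = Counter(peptide.upper())
    n = len(peptide)
    # H = -sum(p * log2(p)) over the observed residues.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy peptides for illustration only.
uniform = shannon_entropy("ACDE")   # four distinct residues
repeated = shannon_entropy("AAAA")  # a single residue type
mixed = shannon_entropy("AACC")     # two residue types, equal counts
```

A peptide made of one repeated residue carries no entropy, while one with four equally frequent residues reaches two bits, so the value summarises compositional diversity in a single number.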


Subjects
Interleukin-4 , Peptides , Cytokines , Amino Acids , Machine Learning
16.
Comput Biol Med ; 164: 107242, 2023 09.
Article in English | MEDLINE | ID: mdl-37473564

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNA molecules that play a crucial role in regulating gene expression at the post-transcriptional level by binding to potential target sites of messenger RNAs (mRNAs), facilitated by the Argonaute family of proteins. Selecting the conservative candidate target sites (CTS) is a challenging step, considering that most of the existing computational algorithms primarily focus on canonical site types, which is a time-consuming and inefficient utilization of miRNA target site interactions. We developed a stacking classifier algorithm that addresses the CTS selection criteria using feature-encoding techniques that generate feature vectors, including k-mer nucleotide composition, dinucleotide composition, pseudo-nucleotide composition, and sequence order coupling. This innovative stacking classifier algorithm surpassed previous state-of-the-art algorithms in predicting functional miRNA targets. We evaluated the performance of the proposed model on 10 independent test datasets and obtained an average accuracy of 79.77%, which is a significant improvement of 7.26% over previous models. This improvement shows that the proposed method has great potential for distinguishing highly functional miRNA targets and can serve as a valuable tool in biomedical and drug development research.
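
Two of the encodings named above, k-mer nucleotide composition and dinucleotide composition, are the same computation with different values of k. The sketch below returns normalised k-mer frequencies in a fixed order. The RNA alphabet, the T-to-U conversion, and the choice to skip windows with unknown characters are assumptions of this example.

```python
from itertools import product


def kmer_composition(sequence, k=2, alphabet="ACGU"):
    """Return normalised k-mer frequencies in lexicographic alphabet order."""
    seq = sequence.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i : i + k]
        if kmer in counts:  # skip windows containing unknown characters
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


# Toy sequence for illustration only; k=2 gives dinucleotide composition.
vector = kmer_composition("AUGAUG", k=2)
```

With a four-letter alphabet the vector has 4**k entries, so dinucleotide composition yields 16 features and trinucleotide composition 64.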


Subjects
MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Algorithms , Computational Biology/methods
17.
Mol Ther ; 31(8): 2543-2551, 2023 08 02.
Article in English | MEDLINE | ID: mdl-37271991

ABSTRACT

5-methylcytosine (m5C) is a critical post-transcriptional alteration that is widely present in various kinds of RNAs and is crucial to fundamental biological processes. By correctly identifying the m5C-methylation sites on RNA, clinicians can more clearly comprehend the precise function of these m5C-sites in different biological processes. Due to their effectiveness and affordability, computational methods have received greater attention over the last few years for the identification of methylation sites in various species. To precisely identify RNA m5C locations in five different species including Homo sapiens, Arabidopsis thaliana, Mus musculus, Drosophila melanogaster, and Danio rerio, we proposed a more effective and accurate model named m5C-pred. To create m5C-pred, five distinct feature encoding techniques were combined to extract features from the RNA sequence, and then we used SHapley Additive exPlanations to choose the best features among them, followed by XGBoost as a classifier. We applied the Optuna optimization framework to quickly and efficiently determine the best hyperparameters. Finally, the proposed model was evaluated using independent test datasets, and we compared the results with the previous methods. Our approach, m5C-pred, is anticipated to be useful for accurately identifying m5C sites, outperforming the currently available state-of-the-art techniques.


Subjects
Drosophila melanogaster , RNA , Animals , Mice , RNA/genetics , Drosophila melanogaster/genetics , Base Sequence
18.
Methods ; 218: 14-24, 2023 10.
Article in English | MEDLINE | ID: mdl-37385419

ABSTRACT

Healthy sleep is vital to all functions in the body. It improves physical and mental health, strengthens resistance against diseases, and develops strong immunity against metabolic and chronic diseases. However, a sleep disorder can cause the inability to sleep well. Sleep apnea syndrome is a critical breathing disorder that occurs during sleep, when breathing stops suddenly and resumes on waking, causing sleep disturbance. If it is not treated in time, it can produce loud snoring and drowsiness or cause more acute health problems such as high blood pressure or heart attack. The accepted standard for diagnosing sleep apnea syndrome is full-night polysomnography. However, its limitations include high cost and inconvenience. This article aims to develop an intelligent monitoring framework for detecting breathing events based on Software Defined Radio Frequency (SDRF) sensing and verify its feasibility for diagnosing sleep apnea syndrome. We extract the wireless channel state information (WCSI) for breathing motion using the channel frequency response (CFR) recorded over time at the receiver. The proposed approach simplifies the receiver structure with the added functionality of communication and sensing together. Initially, simulations are conducted to test the feasibility of the SDRF sensing design for the simulated wireless channel. Then, a real-time experimental setup is developed in a lab environment to address the challenges of the wireless channel. We conducted 100 experiments to collect the dataset of 25 subjects for four breathing patterns. The SDRF sensing system accurately detected breathing events during sleep without subject contact. The developed intelligent framework uses machine learning classifiers to classify sleep apnea syndrome and other breathing patterns with an acceptable accuracy of 95.9%. The developed framework aims to build a non-invasive sensing system to conveniently diagnose patients suffering from sleep apnea syndrome.
Furthermore, this framework can easily be further extended for E-health applications.


Subjects
Sleep Apnea Syndromes , Humans , Sleep Apnea Syndromes/diagnosis , Polysomnography , Software
19.
Comput Biol Med ; 163: 107132, 2023 09.
Article in English | MEDLINE | ID: mdl-37343468

ABSTRACT

Retinal vessel segmentation is an important task in medical image analysis and has a variety of applications in the diagnosis and treatment of retinal diseases. In this paper, we propose SegR-Net, a deep learning framework for robust retinal vessel segmentation. SegR-Net utilizes a combination of feature extraction and embedding, deep feature magnification, feature precision and interference, and dense multiscale feature fusion to generate accurate segmentation masks. The model consists of an encoder module that extracts high-level features from the input images and a decoder module that reconstructs the segmentation masks by combining features from the encoder module. The encoder module consists of a feature extraction and embedding block that is enhanced by dense multiscale feature fusion, followed by a deep feature magnification (DFM) block that magnifies the retinal vessels. To further improve the quality of the extracted features, we use a group of two convolutional layers after each DFM block. In the decoder module, we utilize a feature precision and interference block and a dense multiscale feature fusion (DMFF) block to combine features from the encoder module and reconstruct the segmentation mask. We also incorporate data augmentation and pre-processing techniques to improve the generalization of the trained model. Experimental results on three publicly available fundus image datasets (CHASE_DB1, STARE, and DRIVE) demonstrate that SegR-Net outperforms state-of-the-art models in terms of accuracy, sensitivity, specificity, and F1 score. The proposed framework can provide more accurate and more efficient segmentation of retinal blood vessels in comparison to the state-of-the-art techniques, which is essential for clinical decision-making and diagnosis of various eye diseases.


Subjects
Deep Learning , Algorithms , Image Processing, Computer-Assisted/methods , Retinal Vessels/diagnostic imaging , Fundus Oculi
20.
J Chem Inf Model ; 63(9): 2628-2643, 2023 05 08.
Article in English | MEDLINE | ID: mdl-37125780

ABSTRACT

Toxicity prediction is a critical step in the drug discovery process that helps identify and prioritize compounds with the greatest potential for safe and effective use in humans, while also reducing the risk of costly late-stage failures. It is estimated that over 30% of drug candidates are discarded owing to toxicity. Recently, artificial intelligence (AI) has been used to improve drug toxicity prediction as it provides more accurate and efficient methods for identifying the potentially toxic effects of new compounds before they are tested in human clinical trials, thus saving time and money. In this review, we present an overview of recent advances in AI-based drug toxicity prediction, including the use of various machine learning algorithms and deep learning architectures, of six major toxicity properties and Tox21 assay end points. Additionally, we provide a list of public data sources and useful toxicity prediction tools for the research community and highlight the challenges that must be addressed to enhance model performance. Finally, we discuss future perspectives for AI-based drug toxicity prediction. This review can aid researchers in understanding toxicity prediction and pave the way for new methods of drug discovery.


Subjects
Algorithms , Artificial Intelligence , Humans , Machine Learning , Biological Assay , Drug Discovery